(57) Abstract



The hPMS2 gene encodes a protein which is involved in DNA mismatch repair and is mutated in a subset of patients with hereditary 
nonpolyposis colon cancer (HNPCC). The previously published hPMS2 cDNA sequence lacks an upstream in-frame stop codon preceding 
the presumptive initiating methionine. To further evaluate the 5' terminus of the hPMS2 coding region, we isolated additional cDNA
clones, RT-PCR products, and the corresponding 5' genomic segment of the hPMS2 locus. The hPMS2 gene transcripts were found to
have heterogeneous but collinear 5' termini, one of which contained an in-frame termination codon preceding the initiating methionine. In
addition, a gene encoding a 34.5 kDa polypeptide was found to transcriptionally initiate within hPMS2 from the opposite strand.

peptides from the 85 kDa protein revealed it to be the product of hMLH1, and this
protein's molecular weight agreed with that predicted from the cDNA sequence
(Bronner et al., 1994; Papadopoulos et al., 1994). The sequence of the peptide
generated from the 110 kDa component showed it to be similar to the hPMS2
mutL homolog; however, the predicted molecular weight of hPMS2 is only 95 kDa
(Nicolaides et al., 1994). Since the previously isolated hPMS2 cDNA clones
lacked an in-frame termination codon upstream of the presumptive initiating
methionine, it was possible that the open reading frame extended further upstream.
Thus there is a need in the art for further knowledge of the genetic structures of
and adjacent to the known hPMS2 gene.
SUMMARY OF THE INVENTION

It is an object of the invention to provide a novel, isolated, human gene on
chromosome 7.

It is an object of the invention to provide vectors and host cells for making
a novel human gene product.

It is another object of the invention to provide compositions of matter
containing the human gene product.

These and other objects are provided by one or more of the embodiments
described below. In one embodiment of the invention, a segment of cDNA is
provided. The cDNA consists of the sequence of nucleotides shown in Figure 2.

According to another embodiment of the invention, a vector comprising the
segment of cDNA which consists of the sequence of nucleotides shown in Figure
2 is provided, as well as host cells comprising the vector.

According to still another embodiment of the invention, a composition is
provided. The composition consists essentially of a protein consisting of the amino
acid sequence shown in Figure 2.

In yet another embodiment of the invention a composition of protein JTV1
as shown in Figure 1 is provided. The composition is free of other human
proteins.

In another embodiment of the invention a segment of cDNA is provided
which segment encodes the amino acid sequence of JTV1 protein shown in
Figure 2.

cDNA probes are also provided by the present invention. The cDNA 
portion of said probes consists of between 15 and 1176 contiguous nucleotides of
the sequence shown in SEQ ID NO:1.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 shows the sequence of the 5' region of hPMS2 and predicted 
coding region. The arrow indicates the 5' end of the previously published cDNA 
clone. The presumptive initiating methionine is underlined.

Figure 2 shows the sequence of JTV1. The sequence has been deposited
in Genbank, accession number U24169. The presumptive initiating methionine is
underlined.

Figure 3 demonstrates the genomic localization of JTV1. The genomic
localization of hPMS2 and JTV1 was confirmed by screening somatic-cell hybrids
containing various regions of human chromosome 7. Lane 1, GM10791 contains
entire chromosome 7 in a Chinese hamster ovary (CHO) background; lane 2,
NA11440 contains 7pter>7p22 in a CHO background; lane 3, Ru-Rag4-13
contains 7cen-7pter in a murine background; lane 4, 4AF1/106/K015 contains
7cen-qter in a murine background; lane 5, GM05184.17 contains 7q21.2-qter in
a CHO background; lane 6, 2068Rag22-2 contains 7q22-qter in a murine
background; lane 7, human genomic DNA; lane 8, mouse genomic DNA; lane 9,
CHO genomic DNA.

Figure 4 demonstrates the mapping of transcriptional start sites of hPMS2
and JTV1. Sequence of the genomic region containing the 5' ends of the two
genes is shown. The sequence is numbered with respect to codon 1 of hPMS2.
Lower case letters denote intronic sequence of JTV1 (from nt -479 to -833) and
hPMS2 (from +24 to +108). Arrows indicate the 5' ends of hPMS2 (sense
strand) and JTV1 (antisense strand) cDNA clones. The underlined ATG codons
indicate the predicted initiating methionines for hPMS2 (at nt +1 on the sense
strand) and JTV1 (at nt -345 on the antisense strand). The sequence has been
deposited in Genbank, accession number U24168.

Figure 5 shows the expression of hPMS2 and JTV1. RNA from various
tissues was incubated with reverse transcriptase (RT+) or in control reactions
without reverse transcriptase (RT-). The cDNA was used as template for PCR
with primers specific for hPMS2 (A) and JTV1 (B). RT-PCR products were
separated by polyacrylamide gel electrophoresis.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

To investigate the upstream region from hPMS2, we isolated additional
cDNA clones, analyzed the 5' end of hPMS2 transcripts with PCR-based
techniques, and cloned the corresponding genomic segments. In addition to
clarifying the transcript, we serendipitously discovered a previously undescribed
gene overlapping hPMS2. That gene is termed herein JTV1. The sequences of the
JTV1 cDNA and protein are shown in SEQ ID NOS:1 and 2, respectively.

A segment of cDNA according to the present invention refers to a
contiguous stretch of deoxyribonucleotides which have a sequence as obtained upon
reverse transcription of an RNA transcript. Such segments do not contain introns.
The segment may be an isolated molecule or it can be covalently joined to other
nucleic acid sequences. The segment may, for example, be replicated as part of
a vector, such as a plasmid, virus, or minichromosome. The vector may be
replicated within a host cell, such as a cell transformed by a recombinant DNA
molecule. The host cell may be used to produce JTV1 protein. It can also be
used to study regulation of expression of JTV1 sequences, for example by
subjecting the host cell to various agents which may or may not affect the
expression. Although the DNA sequence is discussed with particularity herein, it
is well within the skill of the art to make small mutations, such as single nucleic
acid substitutions of one of the other three nucleic acid bases, at any of the
positions of the sequence. In addition, it is well within the art to make single base
deletions or single base insertions, to study the effect upon protein structure and
function.
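
The small mutations described above can be enumerated mechanically. The following is a minimal Python sketch, not part of the original disclosure; the function names and the six-nucleotide toy input are illustrative assumptions.

```python
BASES = "ACGT"

def single_substitutions(seq):
    """Yield every variant of seq that differs by exactly one base substitution."""
    for i, original in enumerate(seq):
        for base in BASES:
            if base != original:
                yield seq[:i] + base + seq[i + 1:]

def single_deletions(seq):
    """Yield every variant of seq with exactly one base removed."""
    for i in range(len(seq)):
        yield seq[:i] + seq[i + 1:]

def single_insertions(seq):
    """Yield every variant of seq with exactly one base inserted."""
    for i in range(len(seq) + 1):
        for base in BASES:
            yield seq[:i] + base + seq[i:]

toy = "ATGCCG"  # illustrative six-nucleotide input, not the full cDNA
substitutions = list(single_substitutions(toy))
print(len(substitutions))  # three alternative bases at each of six positions: 18
```

Each position admits three substitutions, so a sequence of n nucleotides has 3n single-substitution variants. Deletions and insertions at neighboring positions can yield identical strings, so those generators may emit duplicates.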
If JTV1 is produced in a recombinant host cell which is not human, a
composition of JTV1 protein will be produced which is free of other human
proteins. If JTV1 protein is isolated from naturally producing cells, or from
human host cells, then the protein can be purified, for example, using antibodies
which are raised against an immunogen comprising JTV1 amino acid sequence.
Any other means of purification known in the art can be used, as is desired.

DNA molecules can be made having different nucleotide sequences from 
that disclosed in SEQ ID NO:1, but which still encode the JTV1 protein as
disclosed in SEQ ID NO:2. Using the known coding relationships between codons 
and amino acids and the disclosed amino acid sequence, numerous other sequences 
can be readily designed and produced. Such DNA molecules are within the 
contemplation of the subject invention. 
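
To give a sense of how many alternative coding sequences the degeneracy of the genetic code allows, the following Python sketch multiplies the number of synonymous codons available for each residue. It assumes the standard genetic code; the helper name and the five-residue example (the first five residues of the JTV1 protein as listed) are chosen for illustration only.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code.
CODON_COUNTS = {
    "Ala": 4, "Arg": 6, "Asn": 2, "Asp": 2, "Cys": 2,
    "Gln": 2, "Glu": 2, "Gly": 4, "His": 2, "Ile": 3,
    "Leu": 6, "Lys": 2, "Met": 1, "Phe": 2, "Pro": 4,
    "Ser": 6, "Thr": 4, "Trp": 1, "Tyr": 2, "Val": 4,
}

def count_synonymous_sequences(peptide):
    """Return how many distinct DNA sequences encode the given residues."""
    return prod(CODON_COUNTS[residue] for residue in peptide)

first_five = ["Met", "Pro", "Met", "Tyr", "Gln"]
n_sequences = count_synonymous_sequences(first_five)
print(n_sequences)  # 1 * 4 * 1 * 2 * 2 = 16
```

Even this short stretch admits 16 encodings; the count grows multiplicatively with each additional residue.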

cDNA probes can be used for hybridization studies. Typically they are 
labeled with a detectable marker, such as a radiolabel or a fluorescent moiety,
although they need not be. The cDNA probes of the subject invention consist of
at least 15 contiguous nucleotides of the sequence shown in SEQ ID NO:1. If
greater specificity is desired, larger molecules of 18, 20, 25, or 30 nucleotides can 
be used, up to a maximum of the entire sequence of 1176 nucleotides. 
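
As an illustration of what "contiguous nucleotides" means for probe selection, the sketch below enumerates sliding windows over a sequence and counts how many (start, length) choices exist between the stated minimum of 15 and the stated maximum of 1176 nucleotides. It is a hedged example; the function names and the toy sequence are assumptions, and the 1176-nucleotide length is taken from the text as given.

```python
def contiguous_probes(sequence, length):
    """Yield every contiguous sub-sequence of the requested length."""
    if length < 1 or length > len(sequence):
        raise ValueError("length must be between 1 and len(sequence)")
    for start in range(len(sequence) - length + 1):
        yield sequence[start:start + length]

def count_probe_windows(seq_len, min_len=15, max_len=None):
    """Count (start, length) windows with min_len <= length <= max_len."""
    max_len = seq_len if max_len is None else max_len
    return sum(seq_len - n + 1 for n in range(min_len, max_len + 1))

toy_windows = list(contiguous_probes("ACGTAC", 4))
print(toy_windows)                # ['ACGT', 'CGTA', 'GTAC']
print(count_probe_windows(1176))  # 675703
```

The count covers window positions only; whether a given window is a useful probe depends on its specificity, which this sketch does not assess.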

JTV1 cDNAs can be used as probes to detect deletions in chromosome 7.
Due to the overlapping promoter regions, large deletions of JTV1 would also be
expected to affect PMS2 expression, leading to Hereditary Non-Polyposis
Colorectal Cancer (HNPCC). JTV1 cDNA can be used in chromosome mapping.
It can also be used to assay activity or competence of the PMS2 promoter region.
The presence of JTV1 transcripts or JTV1 protein suggests that the PMS2 promoter
is intact. If the PMS2 promoter is intact and PMS2 products are absent, a
structural defect in the coding region is indicated.

JTV1 sequences can be used to guide homologous recombination at the
PMS2 locus. For example, where a PMS2 mutation is present and therapeutic
replacement with a wild-type gene is desired, PMS2 sequences can be used to
provide an adjacent region of homology. Similarly, it may be desirable to target
other genes to the region adjacent to PMS2. JTV1 sequences can be used to flank
such other genes, providing one or more regions of homology. If insertion of
other genes is desired between the JTV1 and the PMS2 sequences, again, this can
be accomplished using the identified sequences as homology units for homologous
recombination.

Examples 
Example 1 

Isolation and sequence analysis of the 5' end of hPMS2.

Purified DNA from P1 clone 53, previously determined to contain the
hPMS2 gene (Nicolaides et al., 1994), was digested with EcoRI and subcloned
into the pBluescript vector (Stratagene). Clones containing the 5' region of hPMS2
were identified by hybridization with primer A (Table 1) directed to exon 1.
Restriction analysis of several positive clones showed them to be identical. The
sequence of the relevant region of hPMS2 was determined from both strands using
α-dATP and Sequenase (USB).

Table 1. Primers used for hPMS2.

PRIMER NAME   STRAND      PRIMER SEQUENCE                    POSITION*
A             sense       5'-cgggtgttgcatccatgg-3'           -14 - +4
B             sense       5'-gggtggagcacaacgtcg-3'           -110 - -93
C             sense       5'-ggtcacgacggagaccg-3'            -283 - -267
D             sense       5'-tgcaggtgggaagctccacacgg-3'      -414 - -392
E             sense       5'-tagctcctgccgtgcacg-3'           -448 - -431
F             sense       5'-cgctcctacctgcacgtg-3'           -487 - -470
G             antisense   5'-tagactcagtaccacctgc-3'          +90 - +107
H             sense       5'-tacagaacctgctaaggcc-3'          +24 - +42
I             antisense                                      +116 - +136
J                         5'-caaccatgagacacatcgc-3'          +2545 -
K             antisense   5'-aggttagtgaagactctgtc-3'         +2647 - +2666




* Relative to the presumptive initiating methionine in Figure 1. 

Three clones were isolated, each containing an 8.5 kb EcoRI insert. Partial
sequence analysis of one clone, pSMN, determined that it contained coding
residues of hPMS2 as well as sequences upstream of the previously designated
codon 1. The presumptive initiating codon reported previously has been
designated as nucleotide 1 in Figure 1. The sequence of hPMS2 was extended 833
bp upstream of nucleotide 1. This sequence revealed an in-frame stop codon 321
nts upstream of the published initiator methionine, with no intervening methionines
(Figure 1).
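
The check described above, walking upstream from an initiator ATG in steps of three and looking for a termination codon before any other in-frame ATG, can be sketched in Python as follows. This is an illustrative sketch only; the function name and the short toy sequence are assumptions and do not reproduce the hPMS2 sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def upstream_in_frame_scan(sequence, atg_index):
    """Scan upstream of the ATG at atg_index, one codon at a time.

    Returns (distance_to_stop, saw_upstream_atg). The distance is counted in
    nucleotides from the first base of the stop codon to the A of the ATG,
    and is None if no in-frame stop codon is found.
    """
    if sequence[atg_index:atg_index + 3] != "ATG":
        raise ValueError("atg_index must point at an ATG codon")
    saw_upstream_atg = False
    pos = atg_index - 3
    while pos >= 0:
        codon = sequence[pos:pos + 3]
        if codon in STOP_CODONS:
            return atg_index - pos, saw_upstream_atg
        if codon == "ATG":
            saw_upstream_atg = True
        pos -= 3
    return None, saw_upstream_atg

# Toy sequence: two leading bases, a TGA stop, three sense codons, then ATG GAG.
toy = "CC" + "TGA" + "GCTCCGAAA" + "ATG" + "GAG"
result = upstream_in_frame_scan(toy, 14)
print(result)  # (12, False)
```

An in-frame stop is always a multiple of three nucleotides upstream, which is consistent with the reported distance of 321 nt (107 codons).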
Example 2

Isolation of additional cDNA clones using hPMS2 probes. 

Two cDNA libraries were screened with a probe containing nt +24 to
+136 of hPMS2 generated by PCR using P1 clone 53 as template and the primers
H and I (Table 1). A human small intestine random-primed cDNA library in
λgt10 (Clontech) and a HeLa oligo-dT primed cDNA library in λZAPII
(Stratagene) were screened as described except hybridizations were carried out at
68°C and filters were washed at 65°C for 0.5 hrs (Kinzler and Vogelstein, 1989).
Following plaque purification, the EcoRI inserts from the small intestine library
were subcloned into pBluescript vector, while the HeLa cDNA inserts were
rescued as phagemids following the manufacturer's protocol (Stratagene).

One clone was isolated from the random-primed small intestine library, and
this contained nt -14 to nt +1668 of hPMS2. Two clones were isolated from the
oligo-dT primed HeLa cDNA library. The clones began at nt -53 and ended at
either nt +2722 or +2749. The HeLa cDNA library was also screened with a
430 bp probe from the 5' genomic region of hPMS2, containing nt -414 to +16,
generated by PCR from P1 clone 53 using primers D (Table 1) and O (Table 2).
The same two clones were identified, as expected. However, twelve other
overlapping clones were found and appeared to represent a different transcript,
named JTV1 (Figure 2). These twelve cDNAs were approximately 1.2 kb in
length and were sequenced in their entirety. All twelve ended with a polyA tract
(assumed to be the 3' end) and were identical for 1.2 kb upstream. The 5' ends
were located within 38 bp of each other. Comparison with hPMS2 indicated that
JTV1 was transcribed from the opposite strand.



WO 97/08312 PCT/US96/13598 




Table 2. Primers used for JTV1 cDNA amplification.

PRIMER NAME   STRAND      PRIMER SEQUENCE                    POSITION*
L             sense       5'-gttctgccatgccgatg-3'
M             sense       5'-ggcctttggcacgcgctac-3'          -23 - -41
N             sense       5'-accggactgcgttttcccg-3'          -111 - -129
O             sense       5'-tctcagctcgctccatgg-3'           -343 - -360
P             antisense   5'-gcagagacaggttagactc-3'          +139 - +157
Q             sense       5'-gctccttaagtgaattgccg-3'         +952 - +971
R             antisense   5'-tgacacttgacaactggcc-3'          +1068 - +1086



* Relative to the presumptive initiating methionine in Figure 2. 



The length of one clone representative of JTV1 (pM23NNFL) was 1233 bp
and encoded an open reading frame (ORF) of 936 bp (Figure 2). The first
methionine within this ORF was designated codon 1 (Figure 2) and was preceded
by an in-frame termination codon 66 bp upstream. This methionine had a
reasonable match to the Kozak translation initiation consensus (Kozak, 1986). The
3' end contained a polyadenylation signal (AAUAAA) starting at nucleotide 1086
followed by a polyA tail. The transcript was predicted to encode a polypeptide of
312 amino acids, with a molecular weight of 34.5 kDa. Searches of nucleotide and
peptide sequence databases showed that this was a novel gene, with limited
homology to the glutathione S-transferase gene family.
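
The ORF length, residue count, and approximate mass quoted above are mutually consistent, as the short Python check below illustrates. The CDS coordinates are those given in the sequence listing for the 1233-nucleotide cDNA; the 110 Da average residue mass is a common rule-of-thumb assumption and not a value from this document, so the mass is only a rough estimate.

```python
CDS_START, CDS_END = 114, 1049   # CDS location from the sequence listing (1-based, inclusive)
AVG_RESIDUE_MASS_DA = 110        # assumption: rule-of-thumb average mass per residue

orf_length_bp = CDS_END - CDS_START + 1
codon_count = orf_length_bp // 3
approx_mass_kda = codon_count * AVG_RESIDUE_MASS_DA / 1000

print(orf_length_bp)    # 936
print(codon_count)      # 312
print(approx_mass_kda)  # 34.32
```

The estimate of about 34.3 kDa is close to the stated 34.5 kDa; an exact figure would require summing the masses of the actual residues.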
Example 4

Chromosomal Mapping of JTV1

The hPMS2 locus was previously mapped to chromosome 7p22 by FISH
using P1 clone 53 (Nicolaides et al., 1994). Because multiple hPMS2-related
genes are located on the long arm of chromosome 7 and have conserved 5' regions
(personal observation; Hori et al., 1994), we confirmed the genomic localization
of JTV1 by PCR analysis of rodent-human somatic cell hybrid DNAs containing
various regions of chromosome 7 (Scherer et al., 1993; Powers et al., 1993).
PCR primers were chosen from the 3' untranslated region of hPMS2 and JTV1 and
shown to amplify genomic DNA. hPMS2 primers J and K yielded a 121 bp
product and JTV1 primers Q and R yielded a 134 bp product. PCR products for
both genes were formed in those DNAs containing the 7p22 region: lines
GM10791 (containing the entire human chromosome 7), NA11440 (Coriell
Institute) (7p22>7pter) and Ru-Rag4-13 (7cen-7pter) (Figure 3, lanes 1, 2, and 3).
No products were observed in lines 4AF1/106/K015 (7cen-qter), GM05184.17
(7q21.2-qter), or 2068Rag22-2 (7q22-qter) (Figure 3, lanes 4, 5, and 6).
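
The localization logic of the hybrid panel can be expressed as set operations: the gene must lie in every region that gave a PCR product and in none that did not. The Python sketch below is illustrative only; dividing chromosome 7 into five coarse segments at the breakpoints named in the text is a simplifying assumption, and the function names are hypothetical.

```python
# Coarse, ordered segments of chromosome 7 (a simplification for illustration).
SEGMENTS = ["7pter-7p22", "7p22-7cen", "7cen-7q21.2", "7q21.2-7q22", "7q22-7qter"]

def span(first, last):
    """Return the set of segments from first through last, inclusive."""
    i, j = SEGMENTS.index(first), SEGMENTS.index(last)
    return set(SEGMENTS[i:j + 1])

# Hybrid line -> (chromosome 7 segments retained, PCR product observed)
HYBRID_PANEL = {
    "GM10791":       (span("7pter-7p22", "7q22-7qter"), True),
    "NA11440":       (span("7pter-7p22", "7pter-7p22"), True),
    "Ru-Rag4-13":    (span("7pter-7p22", "7p22-7cen"), True),
    "4AF1/106/K015": (span("7cen-7q21.2", "7q22-7qter"), False),
    "GM05184.17":    (span("7q21.2-7q22", "7q22-7qter"), False),
    "2068Rag22-2":   (span("7q22-7qter", "7q22-7qter"), False),
}

def localize(panel):
    """Intersect positive regions and subtract negative regions."""
    candidates = set(SEGMENTS)
    for segments, pcr_positive in panel.values():
        if pcr_positive:
            candidates &= segments
        else:
            candidates -= segments
    return candidates

location = localize(HYBRID_PANEL)
print(location)  # {'7pter-7p22'}
```

Under these assumptions the only segment compatible with all six hybrids is the distal short arm, in agreement with the 7p22 assignment.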

Example 5

Analysis of the 5' Termini of hPMS2 and JTV1.

The 5' termini of hPMS2 transcripts were studied by standard cDNA
cloning, RACE, and RT-PCR analyses. RNA was purified from tissues and cells
using a guanidine isothiocyanate based method (Chomczynski and Sacchi, 1987).
Reverse transcriptase-polymerase chain reaction (RT-PCR) was performed using
randomly primed cDNA as template as described (Leach et al., 1993). RT-PCR
of the 5' end of hPMS2 was performed using a common antisense primer (I) and
the sense primers (A-F) described in Table 1. RT-PCR mapping of the 5' end of
JTV1 was done using a common antisense primer P and the sense primers L-O as
described in Table 2. RACE (rapid amplification of cDNA ends; Frohman et al.,
1988) was performed on hPMS2 using sequential antisense primers I and G (Table
1) following the manufacturer's protocol (Clontech). RACE analysis of JTV1 was
done using the antisense primer P (Table 2). Amplification products were cloned
into a T-tailed vector (Invitrogen) and sequenced using SP6 and T7 primers.
Amplifications were done at 95°C for 30 sec, 56°C for 1.5 min, and 70°C for
1.5 min for 35 cycles. Reaction products were separated by electrophoresis in 6%
nondenaturing polyacrylamide gels.
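
For planning purposes, the cycling conditions stated above can be written down as data and the total programmed hold time computed. This is a small illustrative Python sketch; it ignores ramp times and any initial denaturation or final extension, none of which are specified in the text.

```python
# (temperature in degrees C, hold time in seconds) for each step of one cycle
CYCLE_STEPS = [(95, 30), (56, 90), (70, 90)]
CYCLES = 35

def total_hold_time_seconds(steps, cycles):
    """Sum the programmed hold times over all cycles (ramp times excluded)."""
    return cycles * sum(duration for _temperature, duration in steps)

total_seconds = total_hold_time_seconds(CYCLE_STEPS, CYCLES)
print(total_seconds)       # 7350
print(total_seconds / 60)  # 122.5
```

The programmed holds alone therefore take a little over two hours; actual run time on a thermal cycler would be longer.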

Figure 4 shows the sequence of the genomic region containing the
transcriptional initiation sites of both hPMS2 and JTV1, numbered as in Figure 1
with respect to hPMS2. The 5' ends of hPMS2 cDNA clones are marked with
arrowheads on the top strand. One clone began at nt -14, one at nt -24, and two
at nt -53. RACE products were generated from adult brain, leukocyte, and
placenta mRNA. Using an antisense primer corresponding to nt +116 to +136,
multiple bands of approximately 160 to 191 bp were observed in addition to
less intense bands of up to 550 bp. The sequence of four cloned RACE products
demonstrated that, as expected, their 5' ends were located between nt -25 and -55.
These data suggested that the majority of hPMS2 transcripts initiated between nt
-13 and -55, with a minority extending further upstream. This was confirmed by
RT-PCR analysis using mRNA from HeLa cells as template. Robust RT-PCR
products were amplified with sense primers whose 5' ends were at nt -14, -110,
-283, and -414 (primers A, B, C, and D; Table 1) and an antisense primer
corresponding to nt +90 to +107 (G). No PCR products were observed using
sense primers whose 5' ends were at nt -448 or -487 (primers E and F). To
ensure that primers E and F were not defective, successful amplification of
genomic DNA was performed using these primers and an antisense primer (O)
corresponding to nt -2 to +16.

The 5' termini of JTV1 showed a heterogeneous pattern like that of hPMS2. 
The 5' ends of the 12 cDNA clones are indicated by arrowheads on the bottom 
strand in Figure 4. They were located 73 to 113 nt upstream of codon 1 of
JTV1, which corresponded to nt -271 to -232 of hPMS2. RACE confirmed the
cDNA results in that the majority of products generated using an antisense primer
P corresponding to JTV1 nt +157 were 230 to 270 bp. RT-PCR analysis was
performed with antisense primer P and several sense primers (L-O) listed in Table
2. PCR products were found with sense primers whose 5' ends were at -8, -23,
and -111 (primers L, M, and N) but not with a sense primer O whose 5' end was
at nt -360 with respect to JTV1 nt +1. The latter primer was not defective, as
a genomic segment could be successfully amplified with it.

Transcripts of hPMS2 had heterogeneous but collinear 5' termini, 
containing 11 to 415 nt of presumably untranslated sequence. The transcripts 
contained an in-frame stop codon upstream of the presumptive initiating 
methionines (Figure 1), making the originally described methionine the most likely 
translation initiator. Because no other upstream coding regions of hPMS2 
appeared to exist, the size discrepancy between that predicted from the hPMS2 
sequence and the 110 kDa hPMS2 protein identified by Li and Modrich is likely 
due to post-transcriptional modifications or alternative internal exons. 

Our results revealed that hPMS2 overlaps with a novel gene, JTV1,
transcribed from the opposite strand (Figure 4). This organization is similar to
that of HUMDUG, a mutS homolog found on human chromosome 5, and the
dihydrofolate reductase (DHFR) gene (Fujii and Shimada, 1989). Both hPMS2-
JTV1 and HUMDUG-DHFR lie in a head to head arrangement, both genes are
ubiquitously expressed, and both have multiple 5' termini. It has been
hypothesized that DHFR and HUMDUG may be regulated via a bidirectional
promoter, because a minor subset of the transcripts from the two genes overlap.
The major transcripts of HUMDUG and DHFR, however, do not overlap, as is
true for hPMS2 and JTV1. It will be of interest to determine whether other
mismatch repair genes are arranged in a head to head fashion with a contiguous
gene and if JTV1 is involved in DNA replication or repair.

Example 6 

Expression of hPMS2 and T7VL 

The e x p res s ion of hPMS2 and JTV1 was analyzed in a variety of mRNA 
samples prepared from human tissues. RT-PCR was performed on cDNA 
templates derived from adult brain, leukocytes, kidney, large intestine, colon, 
salivary gland, lung, testes and prostate using primers J and K for HPMS2 and 



WO 97/08312 



- 13 - 



PCI7US96/I3598 



primers Q and R for JTV1 (Tables 1 and 2). Both genes were expressed in all 
tissues tested (Figure 5). 
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SEQUENCE LISTING 



(1) GENERAL INFORMATION: 

(i) APPLICANT: Vogelstein, Bert
Kinzler, Kenneth W.
Nicolaides, Nicholas C.

(ii) TITLE OF INVENTION: Human JTV1 Gene Overlaps PMS2 Gene

(iii) NUMBER OF SEQUENCES: 5

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: Banner & Allegretti, Ltd.
(B) STREET: 1001 G Street, NW
(C) CITY: Washington DC

(E) COUNTRY: U.S.A.

(F) ZIP: 20001

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER:

(B) FILING DATE:
(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Kagan, Sarah A.

(B) REGISTRATION NUMBER: 32,141

(C) REFERENCE/DOCKET NUMBER: 1107.49697

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: 202-508-9100

(B) TELEFAX: 202-508-9299

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 384 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 46..334

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

TTACCTGCTA CATCCCCATC CCAGAACCAA AGCAAAACGG CGTAC CGC CTG CCA 54

Arg Val Pro
1

AAG GCC AAC GCT CAG AAA COG TCA GAG GTC ACG ACG GAG ACC GGC CAC 102
Lys Ala Asn Ala Gln Lys Pro Ser Glu Val Thr Thr Glu Thr Gly His
5 10 15

CTC OCT TCT GAC CCT GCT GCG GGC GTT CGG GAA AAC CCA GTC CGC TGT 150
Leu Pro Ser Asp Pro Ala Ala Gly Val Arg Glu Asn Ala Val Arg Cys
20 25 30 35

GCT CTG ATT GGC CCA GGC TCT TTG ACG TCA CCA ACT CCA CCT TTG ACA 198
Ala Leu Ile Gly Pro Gly Ser Leu Thr Ser Arg Ser Arg Pro Leu Thr
40 45 50

GAG CCA ATA GGC GAA AAG GAG AGA CGG GAA GTA TTT TTG CCG CCC CGC 246
Glu Pro Ile Gly Glu Lys Glu Arg Arg Glu Val Phe Leu Pro Pro Arg
55 60 65

COG GAA AGG GTG GAG CAC AAC GTC GAA AGC ACC CAA TGG GAG TTC AGG 294
Pro Glu Arg Val Glu His Asn Val Glu Ser Ser Gln Trp Glu Phe Arg
70 75 80

AGG CGG AGC GCC TGT GGG AGC CCT GGA GGG AAC TTT CCC ACT CCC CGA 342
Arg Arg Ser Ala Cys Gly Ser Pro Gly Gly Asn Phe Pro Ser Pro Arg
85 90 95

GGC GGA TCO GGT GTT OCA TCC ATG GAG CGA GCT GAG AGC TOG 384
Gly Gly Ser Gly Val Ala Ser Met Glu Arg Ala Glu Ser Ser
100 105 110



(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 113 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Arg Val Pro Lys Ala Asn Ala Gln Lys Pro Ser Glu Val Thr Thr Glu
1 5 10 15

Thr Gly His Leu Pro Ser Asp Pro Ala Ala Gly Val Arg Glu Asn Ala
20 25 30

Val Arg Cys Ala Leu Ile Gly Pro Gly Ser Leu Thr Ser Arg Ser Arg
35 40 45

Pro Leu Thr Glu Pro Ile Gly Glu Lys Glu Arg Arg Glu Val Phe Leu
50 55 60

Pro Pro Arg Pro Glu Arg Val Glu His Asn Val Glu Ser Ser Gln Trp
65 70 75 80

Glu Phe Arg Arg Arg Ser Ala Cys Gly Ser Pro Gly Gly Asn Phe Pro
85 90 95

Ser Pro Arg Gly Gly Ser Gly Val Ala Ser Met Glu Arg Ala Glu Ser
100 105 110

Ser



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1233 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 114..1049

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
CCGAACGCCC GGAGGAGGGT CAOAAGCGAG GTGCCCGCTC TCCCTCGTCA CCTCTGACGG 60
TTTCTCAGCC TTCGCCTTTG GCAOCCCCTA CACCCTTTTG CTTTGGTTCT CCC ATG 116

Met

1

CCG ATG TAC CAG GTA AAG CCC TAT CAC COG GGC GGC GCG CCT CTC COT 164
Pro Met Tyr Gln Val Lys Pro Tyr His Gly Gly Gly Ala Pro Leu Arg
5 10 15

GTG GAG CTT CCC ACC TGC ATG TAC CGG CTC CCC AAC GTG CAC GGC AGG 212
Val Glu Leu Pro Thr Cys Met Tyr Arg Leu Pro Asn Val His Gly Arg
20 25 30

AGC TAC GGC CCA GCG CCG GGC CCT GGC CAC GTG CAG GAA GAG TCT AAC 260
Ser Tyr Gly Pro Ala Pro Gly Ala Gly His Val Gln Glu Glu Ser Asn
35 40 45

CTG TCT CTG CAA GCT CTT GAG TCC CGC CAA GAT GAT ATT TTA AAA CGT 308
Leu Ser Leu Gln Ala Leu Glu Ser Arg Gln Asp Asp Ile Leu Lys Arg
50 55 60 65

CTG TAT GAG TTG AAA GCT GCA GTT GAT GGC CTC TCC AAG ATG ATT CAA 356
Leu Tyr Glu Leu Lys Ala Ala Val Asp Gly Leu Ser Lys Met Ile Gln
70 75 80

ACA CCA GAT GCA GAC TTG GAT GTA ACC AAC ATA ATC CAA GCG GAT GAG 404
Thr Pro Asp Ala Asp Leu Asp Val Thr Asn Ile Ile Gln Ala Asp Glu
85 90 95

CCC ACG ACT TTA ACC ACC AAT GCG CTC GAC TTG AAT TCA GTG CTT GGG 452
Pro Thr Thr Leu Thr Thr Asn Ala Leu Asp Leu Asn Ser Val Leu Gly
100 105 110

AAC CAT TAC CCC GCC CTG AAA CAC ATC GTG ATC AAC GCA AAC CCG GCC 500
Lys Asp Tyr Gly Ala Leu Lys Asp Ile Val Ile Asn Ala Asn Pro Ala
115 120 125

TCC CCT CCC CTC TCC CTG CTT GTG CTG CAC AGO CTG CTC TGT GAG CAC 548
Ser Pro Pro Leu Ser Leu Leu Val Leu His Arg Leu Leu Cys Glu His
130 135 140 145

TTC AGG CTC CTC TCC ACG GTC CAC ACG CAC TCC TCG CTC AAG AGC GTG 596
Phe Arg Val Leu Ser Thr Val His Thr His Ser Ser Val Lys Ser Val
150 155 160

CCT CAA AAC CTT CTC AAC TCC TTT GGA GAA CAC AAT AAA AAA GAG CCC 644
Pro Glu Asn Leu Leu Lys Cys Phe Gly Glu Gln Asn Lys Lys Gln Pro
165 170 175

CGC CAA GAC TAT CAG CTG CCA TTC ACT TTA ATT TCG AAG AAT GTG CCG 692
Arg Gln Asp Tyr Gln Leu Gly Phe Thr Leu Ile Trp Lys Asn Val Pro
180 185 190

AAG ACG CAG ATG AAA TTC AGC ATC CAG ACQ ATG TCC CCC ATC GAA GGC 740
Lys Thr Gln Met Lys Phe Ser Ile Gln Thr Met Cys Pro Ile Glu Gly
195 200 205

GAA GGO AAC ATT GCA CGT TTC TTC TTC TCT CTG TTT GGC CAG AAG CAT 788
Glu Gly Asn Ile Ala Arg Phe Leu Phe Ser Leu Phe Gly Gln Lys His
210 215 220 225

AAT CCT GTC AAC GCA ACC CTT ATA GAT AGC TOG CTA CAT ATT GCC ATT 836
Asn Ala Val Asn Ala Thr Leu Ile Asp Ser Trp Val Asp Ile Ala Ile
230 235 240

TTT CAG TTA AAA GAG GGA AGC ACT AAA GAA AAA GCC CCT GTT TTC CGC 884
Phe Gln Leu Lys Glu Gly Ser Ser Lys Glu Lys Ala Ala Val Phe Arg
245 250 255

TCC ATG AAC TCT OCT CTT GGG AAG AGC CCT TGG CTC CCT GGG AAT GAA 932
Ser Met Asn Ser Ala Leu Gly Lys Ser Pro Trp Leu Ala Gly Asn Glu
260 265 270

CTC ACC CTA GCA GAC GTG GTG CTG TGG TCT GTA CTC CAG CAG ATC GGA 980
Leu Thr Val Ala Asp Val Val Leu Trp Ser Val Leu Gln Gln Ile Gly
275 280 285

GGC TCC AGT GTG ACA GTG CCA GCC AAT GTG CAG AGC TGG ATG AGC TCT 1028
Gly Cys Ser Val Thr Val Pro Ala Asn Val Gln Arg Trp Met Arg Ser
290 295 300 305

TCT CAA AAC CTG CCT CCT TTT TAACACGGCC CTCAAGCTCC TTAAGTGAAT 1079
Cys Glu Asn Leu Ala Pro Phe
310

TCCCCTAACT GATTTTAAAG GGTTTAGATT TTAAGAATGG TICTCTTTCf. TGCCTATTM 1139

CAGTAAGGGG ACTTGTATTA GAGTCAGAGT CTTTTTATTT AGG CCAGTTG TCAAGTGT^A 1199

ATAAAAGCGC ATCATGTAAT TTAAAAAAAA AAAA 1233



(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 312 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Met Pro Met Tyr Gln Val Lys Pro Tyr His Gly Gly Gly Ala Pro Leu
1 5 10 15

Arg Val Glu Leu Pro Thr Cys Met Tyr Arg Leu Pro Asn Val His Gly
20 25 30

Arg Ser Tyr Gly Pro Ala Pro Gly Ala Gly His Val Gln Glu Glu Ser
35 40 45

Asn Leu Ser Leu Gln Ala Leu Glu Ser Arg Gln Asp Asp Ile Leu Lys
50 55 60

Arg Leu Tyr Glu Leu Lys Ala Ala Val Asp Gly Leu Ser Lys Met Ile
65 70 75 80

Gln Thr Pro Asp Ala Asp Leu Asp Val Thr Asn Ile Ile Gln Ala Asp
85 90 95

Glu Pro Thr Thr Leu Thr Thr Asn Ala Leu Asp Leu Asn Ser Val Leu
100 105 110

Gly Lys Asp Tyr Gly Ala Leu Lys Asp Ile Val Ile Asn Ala Asn Pro
115 120 125

Ala Ser Pro Pro Leu Ser Leu Leu Val Leu His Arg Leu Leu Cys Glu
130 135 140

His Phe Arg Val Leu Ser Thr Val His Thr His Ser Ser Val Lys Ser
145 150 155 160

Val Pro Glu Asn Leu Leu Lys Cys Phe Gly Glu Gln Asn Lys Lys Gln
165 170 175

Pro Arg Gln Asp Tyr Gln Leu Gly Phe Thr Leu Ile Trp Lys Asn Val
180 185 190

Pro Lys Thr Gln Met Lys Phe Ser Ile Gln Thr Met Cys Pro Ile Glu
195 200 205

Gly Glu Gly Asn Ile Ala Arg Phe Leu Phe Ser Leu Phe Gly Gln Lys
210 215 220

His Asn Ala Val Asn Ala Thr Leu Ile Asp Ser Trp Val Asp Ile Ala
225 230 235 240

Ile Phe Gln Leu Lys Glu Gly Ser Ser Lys Glu Lys Ala Ala Val Phe
245 250 255

Arg Ser Met Asn Ser Ala Leu Gly Lys Ser Pro Trp Leu Ala Gly Asn
260 265 270

Glu Leu Thr Val Ala Asp Val Val Leu Trp Ser Val Leu Gln Gln Ile
275 280 285

Gly Gly Cys Ser Val Thr Val Pro Ala Asn Val Gln Arg Trp Met Arg
290 295 300

Ser Cys Glu Asn Leu Ala Pro Phe
305 310

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 900 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: mRNA

(B) LOCATION: complement (1..900)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:



ACACOOGGCC AATTTCTGTA TTTTTAOTAG ACACGAGGTT TTACCATGTT GCCCAGGCTA 60
GTCTOGAACT CCTGACCTCA GGTGATCCGC CCGCCTOGGC CTCCCAAAGT GCTGGGATTA 120
CAGCCGTGAG CCACGGCGCC CGGCCTOGAT AAATCTTTTA AAAGATAAAA GTCTGAGTGA 180
GTCCCTGGCC GGCCGGCACA GATGCCGGGG TGGGGCCGTG AACCGGTTGG GACGCGCTCG 240
CTOCGGCCTG GGGGGACCCG GGCCAGCAGC CGGTCGCCGC GCGTGCGCAC TGGGCGGGGG 300
GCCCOGCCCT CCTACCTGCA CGTGGCCAGG CCCGGCGCTG OGCCGTAGCT CCTGCOGTGC 360
ACGTTGGGGA GCCGGTACAT OGAGGTGGGA AGCTCCACAC GGAGAGGCGC GCCGCCCCCG 420
TGATAGGGCT TTACCTGGTA CATCGGCATG GCAGAACCAA AGCAAAAGGG GGTAGCGOGT 480
GCCAAAGGCC AACGCTCAGA AACCGTCAGA GGTCACGACG GAGACCGGCC ACCTCCCTTC 540
TGACCCTGCT GCGGGCGTTC GGGAAAACGC AGTCCCGTGT GCTCTGATTG GCCCAGGCCC 600
TTTGACGTCA CGAAGTCGAC CTTTGACAGA GCCAATAGGC GAAAAGGAGA GACGGGAAGT 660
ATTTTTCCCC CCCCGCCCGG AAAGGGTGGA GCACAACGTC GAAAGCAGCC AATGGGAGTT 720
CAGGAGGCGC AGCGCCTGTG GGAGCCCTCG AGGGAACTTT CCCAGTCCCC GAGGCGGATC 780
GGGTGTTGCA TCCATGGAGC CACCTCACAG CTCGAGCTGA GCGGGCCTCO CAGTCTTCCG 840
GTGTCCCCTC TCGCGCGCCC TCTTTGAGAC CCACGGCATT CCAACCTCCC TGGAAATGGG 900

CLAIMS

1. A segment of cDNA consisting of the nucleotide sequence shown
in Figure 2.

2. A vector comprising the segment of DNA of claim 1.

3. A host cell which comprises the vector of claim 2.

4. A composition consisting essentially of a protein consisting of the
amino acid sequence shown in Figure 2.

5. A composition of protein JTV1 as shown in Figure 1, wherein said
composition is free of other human proteins.

6. A segment of cDNA which encodes the amino acid sequence of
JTV1 protein shown in Figure 2.

7. A cDNA probe wherein said cDNA consists of between 15 and 1176
contiguous nucleotides of the sequence shown in SEQ ID NO:1.